The source for this notebook can be found at https://github.com/NASA-NAVO/servicemon/blob/main/servicemon/analysis/notebooks/ExplorePerformanceData.ipynb.
NAVO has started regularly querying some TAP and Cone Search services to collect data on their response times. So far these are mostly NAVO services, but they also include a CDS 2MASS cone search for comparison. (Some Chandra Source Catalog queries are also run, but because of that catalog's sparse sky coverage these still need to be adjusted.)
The queries are done using the servicemon application (https://servicemon.readthedocs.io/en/latest/), and are executed from several different locations. The AWS instrumentation is handled with the software at https://github.com/NASA-NAVO/AWS_servicemon. The results are written to a TAP-accessible database currently running at IPAC.
Now that everyone can examine the monitoring data and run additional tests, everyone can contribute by:
Analyzing response data.
Developing other plots, analysis or alerts.
Maintaining the operational monitoring.
Monitoring parameters
servicemon and AWS_servicemon.
Short term:
Longer term:
All of the query parameters are configurable; the settings below are what is currently running. All TAP queries are now asynchronous.
| base_name | service_type |
|---|---|
| CDS_2MASS | cone |
| Chandra_CSC | cone |
| Chandra_CSC | tap |
| HEASARC_swiftmastr | cone |
| HEASARC_swiftmastr | tap |
| HEASARC_xmmssc | cone |
| HEASARC_xmmssc | tap |
| IPAC_2MASS | cone |
| IPAC_2MASS | tap |
| IPAC_WISE | cone |
| IPAC_WISE | tap |
| NED_NED | cone |
| NED_NED | tap |
| STScI_2MASS | cone |
| STScI_PanSTARRS | tap |
| STScI_PanSTARRS | xcone |
| STScI_ObsTAP | tap |
| STScI_WISE | cone |
A set of 10 random cone queries, with radii ranging from 0 to 0.25 degrees, is run for each service every 6 hours. The exact hours are staggered by location.
We should change this to include (or only use) fixed cones, so that we can compare the exact same queries over time. (servicemon can be run with fixed or random targets.)
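As an illustration of the difference between random and fixed targets, here is a minimal sketch that draws 10 cones with radii between 0 and 0.25 degrees. It is not servicemon's own sampling code, which is not shown here; the function names `random_cone` and `make_cones` and the `seed` parameter are hypothetical. Passing a fixed seed reproduces the same cones on every run, which is one simple way to get comparable queries over time.

```python
import math
import random


def random_cone(rng, min_radius=0.0, max_radius=0.25):
    """Return one random cone as (ra, dec, sr), all in degrees."""
    ra = rng.uniform(0.0, 360.0)
    # Sampling sin(dec) uniformly spreads positions evenly over the sphere
    # instead of clustering them near the poles.
    dec = math.degrees(math.asin(rng.uniform(-1.0, 1.0)))
    sr = rng.uniform(min_radius, max_radius)
    return ra, dec, sr


def make_cones(count=10, seed=None):
    """Return `count` cones; a fixed seed gives the same cones every time."""
    rng = random.Random(seed)
    return [random_cone(rng) for _ in range(count)]


# Hypothetical usage: a reproducible ("fixed") set of 10 cones.
fixed_cones = make_cones(count=10, seed=42)
```

Calling `make_cones(seed=None)` instead yields a fresh random set on each run.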
The queries are run from the following AWS regions:
'ap-northeast-1', 'ap-southeast-2', 'eu-west-3', 'sa-east-1', 'us-east-1', 'us-west-2'
Due to testing, the database may also contain scattered results from other locations.
The TAP service at http://navo01.ipac.caltech.edu/TAP has a table called navostats2 with one row per query run by servicemon. This table contains data starting on about April 6, 2021.
For legacy data from February and March 2021, there is also an older table called navostats which contains results from Feb 2, 2021 to Mar 27, 2021, with slightly different column names as detailed in https://github.com/NASA-NAVO/servicemon/issues/47.
For more details on the older table, see https://nasa-navo.github.io/notebooks/ExplorePerformanceData_original_columns.html
Note: The VOSI endpoints have not yet been implemented for this service, so PyVO and Topcat will complain during metadata gathering. Both PyVO and Topcat can still be used to query the service, and all the TAP_SCHEMA tables are implemented, so those can be used to query metadata.
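Since the TAP_SCHEMA tables are available, column metadata can be retrieved with an ordinary ADQL query. The sketch below uses PyVO's `TAPService.search` for a synchronous query. The helper names are hypothetical, and it assumes the table is registered in TAP_SCHEMA under the unqualified name `navostats2`; if the service stores a schema-qualified name, the `WHERE` clause would need adjusting.

```python
TAP_URL = "http://navo01.ipac.caltech.edu/TAP"  # service URL given above


def build_column_metadata_query(table_name):
    """Build an ADQL query listing the columns of one table via TAP_SCHEMA."""
    # Double any single quotes so the name is a valid ADQL string literal.
    safe_name = table_name.replace("'", "''")
    return (
        "SELECT column_name, datatype, description "
        "FROM TAP_SCHEMA.columns "
        f"WHERE table_name = '{safe_name}'"
    )


def fetch_column_metadata(table_name="navostats2", url=TAP_URL):
    """Run the metadata query and return the result as an astropy Table."""
    import pyvo  # imported lazily so the query builder works without pyvo

    service = pyvo.dal.TAPService(url)
    return service.search(build_column_metadata_query(table_name)).to_table()
```

Calling `fetch_column_metadata()` requires network access to the service; `build_column_metadata_query("navostats2")` only builds the query string, which can also be pasted into Topcat.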
The following columns are available:
| column_name | datatype | format | description |
|---|---|---|---|
| ra | double | 20.6f | Right Ascension of the query cone region. |
| dec | double | 20.6f | Declination of the query cone region. |
| sr | double | 20.6f | Radius of the query cone region (deg). |
| adql | char | 300s | For TAP queries this is the full ADQL query that was done. Empty for non-TAP queries. |
| column_name | datatype | format | description |
|---|---|---|---|
| access_url | char | 300s | The base URL of the service. |
| base_name | char | 20s | A short name of the service given by the servicemon configuration files. Not yet consistent for all services. |
| service_type | char | 20s | While other values are possible, the main service types we're tracking now are tap, cone, and xcone, which is like cone but not VO-compliant. |
| location | char | 80s | Self-declared location of the monitoring service (e.g., AWS region). |
| start_time | char | 30s | The date and time that the query was started (format='%Y-%m-%d %H:%M:%S.%f'). |
| end_time | char | 30s | The date and time that the query was completed (format='%Y-%m-%d %H:%M:%S.%f'). |
Note that these values may be empty for certain types of query failures.
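Because start_time and end_time are stored as strings, they need to be parsed before doing any time arithmetic. A minimal sketch using the documented format string follows; the helper names are hypothetical, and it assumes that a missing value arrives as `None` or as an empty or blank string.

```python
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M:%S.%f"  # format documented for both columns


def parse_time(value):
    """Parse a start_time/end_time string; return None if it is empty."""
    if value is None or not value.strip():
        return None
    return datetime.strptime(value.strip(), TIME_FORMAT)


def elapsed_seconds(start_time, end_time):
    """Return end - start in seconds, or None if either value is missing."""
    start = parse_time(start_time)
    end = parse_time(end_time)
    if start is None or end is None:
        return None
    return (end - start).total_seconds()
```

For example, `elapsed_seconds("2021-04-22 00:00:01.500000", "2021-04-22 00:00:03.750000")` gives 2.25, while a row with an empty end_time gives `None` rather than raising an error.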
| column_name | datatype | format | description |
|---|---|---|---|
| do_query_dur | double | 20.6f | Time to an HTTP response indicating that the query is complete, but prior to the results being streamed back to the client. |
| stream_to_file_dur | double | 20.6f | Time to download the results after the HTTP response indicating that the query was complete. |
| query_total_dur | double | 20.6f | Total time from query start to query end, including download time. |
| extra_dur0_name | char | 20s | "tap_submit" for async tap results, null otherwise. |
| extra_dur0_value | double | 20.6f | Duration of submitting the TAP submit request for async tap results, null otherwise. |
| extra_dur1_name | char | 20s | "tap_run" for async tap results, null otherwise. |
| extra_dur1_value | double | 20.6f | Duration of submitting the TAP run request for async tap results, null otherwise. |
| extra_dur2_name | char | 20s | "tap_wait" for async tap results, null otherwise. |
| extra_dur2_value | double | 20.6f | Duration of submitting and waiting for the TAP wait query for async tap results, null otherwise. |
| extra_dur3_name | char | 20s | "tap_raise_if_error" for async tap results, null otherwise. |
| extra_dur3_value | double | 20.6f | Duration of calling the pyvo AsyncTAPJob.raise_if_error() function for async tap results, null otherwise. |
| extra_dur4_name | char | 20s | "tap_fetch_response" for async tap results, null otherwise. |
| extra_dur4_value | double | 20.6f | Duration of calling the pyvo AsyncTAPJob.fetch_result() function for async tap results (does not include the time to actually retrieve the data and save it in a file), null otherwise. |
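The extra_durN_name and extra_durN_value columns form name/value pairs, so it can be convenient to collapse them into a single mapping per row. The sketch below assumes a row is available as a plain dict (for example from `DataFrame.to_dict("records")`) and that nulls show up as `None` or NaN. The function name is hypothetical, and the numbers in the sample row are made up purely to show the shape of the data.

```python
def extra_durations(row):
    """Map each non-null extra_durN_name to its extra_durN_value."""
    durations = {}
    for i in range(5):  # columns extra_dur0 through extra_dur4
        name = row.get(f"extra_dur{i}_name")
        value = row.get(f"extra_dur{i}_value")
        # Skip nulls: names must be non-empty strings, and NaN != NaN.
        if isinstance(name, str) and name and value is not None and value == value:
            durations[name] = value
    return durations


# Made-up example values, only to show the shape of one async TAP row.
async_row = {
    "extra_dur0_name": "tap_submit", "extra_dur0_value": 0.5,
    "extra_dur1_name": "tap_run", "extra_dur1_value": 0.25,
    "extra_dur2_name": "tap_wait", "extra_dur2_value": 2.0,
    "extra_dur3_name": "tap_raise_if_error", "extra_dur3_value": 0.125,
    "extra_dur4_name": "tap_fetch_response", "extra_dur4_value": 0.125,
}
steps = extra_durations(async_row)
```

For cone and xcone rows, where all ten columns are null, the function returns an empty dict.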
| column_name | datatype | format | description |
|---|---|---|---|
| num_columns | integer | 9d | Number of FIELDs in the result VOTable. |
| num_rows | integer | 9d | Number of rows in the result VOTable. |
| size | integer | 10d | Size of the result VOTable (bytes). |
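Combining size with stream_to_file_dur gives a rough effective download rate, which can help separate slow services from large results. This is only a sketch: the function name is hypothetical, and it assumes the duration columns are in seconds, which the column descriptions do not state explicitly.

```python
def download_rate(size_bytes, stream_to_file_dur):
    """Return bytes per second, or None when the rate cannot be computed."""
    # Covers a missing size, and a missing, zero, or negative duration.
    if size_bytes is None or not stream_to_file_dur or stream_to_file_dur <= 0:
        return None
    return size_bytes / stream_to_file_dur
```

Returning `None` instead of raising keeps failed queries, which may have no size or duration, from interrupting a pass over many rows.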
from bokeh.plotting import output_file, output_notebook, show, reset_output
from servicemon.analysis.stat_queries import StatQueries
from servicemon.analysis.basic_plotting import create_service_plots, create_source, create_plot_location_shapes
The class and functions described in this API document support making queries, converting our query results to pandas, then plotting some sample plots using bokeh, both in a notebook and on a web page. That API is used by the code below.
sq = StatQueries()
services = sq.get_name_service_pairs()
create_service_plots(sq, services, start_time='2021-04-22', end_time='2021-04-25')
reset_output()
output_notebook()
sq = StatQueries()
query = """
select * from navostats2
where location in (
'ap-northeast-1',
'ap-southeast-2',
'eu-west-3',
'sa-east-1',
'us-east-1',
'us-west-2'
)
"""
data = sq.do_query(query)
source = create_source(data)
plot = create_plot_location_shapes(source)
show(plot)
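The same query result can also be summarized numerically. The sketch below assumes that `sq.do_query` returns a pandas DataFrame with the column names listed earlier; the function `summarize_by_location` is hypothetical and not part of servicemon, and the small frame at the end contains made-up values for illustration only.

```python
import pandas as pd


def summarize_by_location(data):
    """Return count and median of query_total_dur per service and location."""
    return (
        data.dropna(subset=["query_total_dur"])  # ignore failed queries
        .groupby(["base_name", "service_type", "location"])["query_total_dur"]
        .agg(["count", "median"])
        .reset_index()
    )


# Made-up rows, only to show the expected input shape.
sample = pd.DataFrame(
    {
        "base_name": ["IPAC_2MASS"] * 4,
        "service_type": ["cone"] * 4,
        "location": ["us-east-1", "us-east-1", "us-west-2", "us-west-2"],
        "query_total_dur": [1.0, 3.0, 2.0, None],
    }
)
summary = summarize_by_location(sample)
```

With real data, `summarize_by_location(data)` would take the frame returned by the query above.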